Course Information
Course Title
資料科學與商業分析
Data Science and Business Analytics 
Semester
108-1 
Designated for
College of Management; Global MBA Program (GMBA), College of Management
Instructor
王志軒 
Course Number
GMBA5028 
Course Identifier
749EU0210 
Class
 
Credits
3.0 
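Full/Half Year
Half year
Required/Elective
Elective
Time
Thursday, periods 2, 3, 4 (9:10~12:10)
Location
Room 405, Management Building 1 (管一405)
Remarks
This course is taught in English. It is recommended for students with a background in statistics and basic programming.
Restricted to GMBA degree students.
Enrollment limit: 30 students
Ceiba Course Website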
http://ceiba.ntu.edu.tw/1081GMBA5028_ 
Course Introduction Video
 
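Core Competencies
Map of core competencies and curriculum planning
Course Syllabus
To protect everyone's rights, please respect intellectual property rights and do not make illegal photocopies.
Course Description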

This course provides graduate and upper-division students with an accessible, interdisciplinary introduction to data science and business analytics, covering both theory and implementation in R. The prerequisites are statistics and basic programming. Students are required to bring laptops to class to practice coding and work with real datasets. The material is intended for applications in the manufacturing and service sectors. The course load is heavy, and the course does not follow a case-oriented, in-class discussion format. Students are expected to apply the skills they learn in class to data-driven decision making and decision support.

Course Objectives

The primary objective of this course is to guide students through a PDCA (plan-do-check-act) loop in data science to solve real problems: defining the problem, selecting appropriate methods, evaluating their performance, and modifying the constructed models. The specific objectives, which span interdisciplinary research interests, are:
1. Applying statistics to real problems (quality control),
2. Applying clustering skills to real problems (target marketing),
3. Applying classification skills to real problems (bankruptcy prediction),
4. Applying regression skills to real problems (demand forecasting),
5. Applying dimension-reduction skills to real problems (business intelligence).
Course Requirements
1. Statistics (or equivalent),
2. Basic programming (a must, not a plus),
3. Database (not a must, but a plus)
Expected Weekly Study Hours After Class
 
Office Hours
Remarks: Data Science, Business Analytics
Required Reading
1. Statistics (or equivalent),
2. Basic programming (a must, not a plus),
3. Database (not a must, but a plus)
References

1. Introduction to Data Mining (textbook), Tan et al., Pearson.
2. Data Mining for Business Analytics: Concepts, Techniques, and Applications in R, Shmueli et al., Wiley.
3. Personal handouts for R coding in data science.
4. Published academic papers and industry news/reports.
5. Machine Learning with R: Expert Techniques for Predictive Modeling, 3rd Edition, Brett Lantz, Packt Publishing.
Grading
(for reference only)
 
No.  Item  Percentage  Description
1.  Take-Home Assignments  60%  4 assignments (15% each)
2.  Midterm Exam  15%  Multiple-choice problems + scenario simulation
3.  Final Exam  15%  Multiple-choice problems + scenario simulation
4.  Project Simulation  10%  Following a PDCA loop
 
Course Schedule
Week  Date  Topic
Week 1  9/12  Introduction to data science and the top 10 algorithms
Week 2  9/19  Overview of statistics and R programming
Week 3  9/26  Data processing (outlier detection, chi-square test, proportion test)
Week 4  10/03  Statistical analysis (one-tailed/two-tailed t-test, ANOVA, regression)
Week 5  10/10  National Day Holiday (no class)
Week 6  10/17  Clustering (K-means, K-medoids, C-means)
Week 7  10/24  Clustering (Gaussian mixture modeling, hierarchical clustering, DBSCAN)
Week 8  10/31  Association (Apriori algorithm)
Week 9  11/07  Basic classifiers (KNN, Naïve Bayes, Logit/Probit regression) / HW2 due
Week 10  11/14  Basic classifiers (KNN, Naïve Bayes, Logit/Probit regression)
Week 11  11/21  Midterm Exam
Week 12  11/28  Ensemble learning (random forest, bagging, boosting)
Week 13  12/05  Advanced classifiers (support vector machine, neural network) / HW3 due
Week 14  12/12  Machine-learning-based regression (support vector machine, neural network, random forest)
Week 15  12/19  Statistical regression (MLR, MARS, PLS)
Week 16  12/26  Biased regression (Ridge, Lasso, Elastic Net)
Week 17  1/02  Final Exam